An Approach to Named Entity Disambiguation Based on Explicit Semantics

Author

  • Martin Jačala
Abstract

Identifying named entities in unstructured, human-written text is a well-established subtask of Natural Language Processing. However, marking each occurrence of a named entity in the text with a class label is usually not sufficient, as the same word often describes several different entities. A dedicated area of NLP, Named Entity Disambiguation, has been devised to solve this problem. In this work we present an approach to Named Entity Disambiguation based on Explicit Semantic Analysis. We use a semantic similarity measure that compares the context of the entity with the documents describing its possible meanings. We also use additional semantics provided by Wikipedia, such as disambiguation and redirect pages and links between documents. Evaluation of the proposed method shows an improvement over the traditionally used Latent Semantic Analysis.
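
The general idea of comparing an entity's context with the documents describing its candidate meanings can be illustrated with a minimal sketch. This is not the author's implementation: the four-article corpus, the whitespace tokenizer, the TF-IDF weighting, and the function names (`build_index`, `esa_vector`, `disambiguate`) are all assumptions made for illustration. In the setting of the abstract, the concept space would be Wikipedia articles and the candidate list would plausibly come from disambiguation and redirect pages.

```python
import math
from collections import Counter

# Toy stand-in for Wikipedia: concept title -> article text.
# These texts are invented for illustration only.
CONCEPTS = {
    "Jaguar (animal)": "jaguar big cat predator jungle prey animal",
    "Jaguar Cars": "jaguar car luxury vehicle engine british manufacturer",
    "Rainforest": "jungle rainforest tree animal predator prey river",
    "Automobile": "car vehicle engine road driver wheel",
}


def tokenize(text):
    """Naive whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


def build_index(concepts):
    """Build an inverted index: word -> {concept: TF-IDF weight}."""
    n = len(concepts)
    tf = {c: Counter(tokenize(t)) for c, t in concepts.items()}
    df = Counter(w for counts in tf.values() for w in counts)
    index = {}
    for concept, counts in tf.items():
        for word, freq in counts.items():
            idf = math.log(n / df[word])
            if idf > 0:  # words found in every article carry no signal
                index.setdefault(word, {})[concept] = freq * idf
    return index


def esa_vector(text, index):
    """Map a text to a weighted vector over explicit concepts."""
    vec = Counter()
    for word in tokenize(text):
        for concept, weight in index.get(word, {}).items():
            vec[concept] += weight
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def disambiguate(context, candidates, concepts, index):
    """Pick the candidate whose article is closest to the context."""
    ctx = esa_vector(context, index)
    scores = {
        c: cosine(ctx, esa_vector(concepts[c], index)) for c in candidates
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    idx = build_index(CONCEPTS)
    candidates = ["Jaguar (animal)", "Jaguar Cars"]
    best, scores = disambiguate(
        "the jaguar chased its prey through the jungle",
        candidates, CONCEPTS, idx,
    )
    print(best, scores)
```

With this toy corpus, a context about prey and the jungle selects "Jaguar (animal)", while a context mentioning an engine and luxury selects "Jaguar Cars". Unlike Latent Semantic Analysis, the dimensions of the vectors here are named concepts rather than latent factors, which is what makes the semantics explicit.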

Similar resources

A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features

Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three popular methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrid. Machine-learning-based methods perform well on the Persian language if they are trained with good features. To get good performanc...

Towards large-scale, open-domain and ontology-based named entity classification

Named entity recognition and classification research has so far mainly focused on supervised techniques and has typically considered only small sets of classes with regard to which to classify the recognized entities. In this paper we address the classification of named entities with regard to large sets of classes which are specified by a given ontology. Our approach is unsupervised as it reli...

Named Entity Disambiguation for German News Articles

Named entity disambiguation has become an important research area, providing the basis for improving search engine precision and for enabling semantic search. Current approaches to named entity disambiguation are usually based on exploiting structured semantic and lingual resources (e.g. WordNet, DBpedia). Unfortunately, each of these resources covers independently from each other insufficie...

Exploiting WordNet for Wikipedia-Based Named Entity Disambiguation

Entity disambiguation is an important problem in semantic analysis and natural language processing. In this paper, we propose an approach to employ features of the WordNet ontology in the task of disambiguating named entities to Wikipedia. Methods of enriching text with synonymous relations of words are explored. An analysis of the results from our experiments shows that the accuracy of the dis...

Publication date: 2011